Boosting for multiclass semi-supervised learning
Abstract
Supervised learning methods are effective when sufficient labeled instances are available. In many applications, however, such as object detection and document and web-page categorization, labeled instances are difficult, expensive, or time-consuming to obtain because they require empirical research or experienced human annotators. Semi-supervised learning algorithms use both labeled and unlabeled data to build a classifier: they combine the information in the unlabeled examples with the explicit class information of the labeled examples to improve classification performance. Most semi-supervised learning algorithms were designed for binary classification problems. However, many practical domains, such as recognition of speech, objects, and characters, involve more than two classes. A multiclass classification problem can be decomposed into a number of independent binary classification problems with methods such as one-versus-all, but these schemes have their own problems. One-versus-all results in imbalanced class distributions, and because each classifier is trained independently, the weights of their outputs may be on different scales, so combining them is non-trivial. There is thus a need for a direct multiclass algorithm for semi-supervised learning. In this paper we propose Multiclass SemiBoost, a new algorithm for multiclass semi-supervised learning that follows the boosting approach and directly generalizes the binary SemiBoost algorithm [3] to the multiclass setting. SemiBoost uses both the similarity between points and the classifier predictions to sample unlabeled examples and assign them “pseudo-labels”. The key advantage of Multiclass SemiBoost is that it exploits both the manifold and the cluster assumptions to train the classifiers using boosting.
We derive the algorithm from an objective function that combines empirical loss on the labeled data and inconsistency of
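The imbalance that the abstract attributes to one-versus-all can be illustrated with a small sketch. This is not code from the paper; the function name `one_versus_all_positive_fractions` and the balanced five-class label set are assumptions made purely for illustration. It shows that even when the original classes are perfectly balanced, each one-versus-all binary problem has only a 1/K share of positive examples.

```python
from collections import Counter


def one_versus_all_positive_fractions(labels):
    """Return, for each class, the fraction of examples that are positive
    in that class's one-versus-all binary problem (hypothetical helper)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Assumed toy data: 5 classes with 20 examples each (perfectly balanced).
labels = [cls for cls in range(5) for _ in range(20)]
fractions = one_versus_all_positive_fractions(labels)

for cls, frac in fractions.items():
    # Each binary problem sees `frac` positives and `1 - frac` negatives.
    print(f"class {cls}: positives={frac:.2f}, negatives={1 - frac:.2f}")
```

With K balanced classes the positive share is 1/K, so the skew grows with the number of classes, which is one motivation for a direct multiclass formulation.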
Similar resources
Multiclass Semi-supervised Boosting Using Different Distance Metrics
The goal of this thesis project is to build an effective multiclass classifier which can be trained with a small amount of labeled data and a large pool of unlabeled data by applying semi-supervised learning in a boosting framework. Boosting refers to a general method of producing a very accurate classifier by combining rough and moderately inaccurate classifiers. It has attracted a significant...
Semi-Supervised Boosting for Multi-Class Classification
Most semi-supervised learning algorithms have been designed for binary classification, and are extended to multi-class classification by approaches such as one-against-the-rest. The main shortcoming of these approaches is that they are unable to exploit the fact that each example is only assigned to one class. Additional problems with extending semisupervised binary classifiers to multi-class p...
ManifoldBoost: Stagewise Function Approximation for Fully-, Semi- and Un-supervised Learning
We introduce a boosting framework to solve a classification problem with added manifold and ambient regularization costs. It allows for a natural extension of boosting into both semisupervised problems and unsupervised problems. The augmented cost is minimized in a greedy, stagewise functional minimization procedure as in GradientBoost. Our method provides insights into generalization issues in...
Tracking-Based Semi-Supervised Learning
In this paper, we consider a semi-supervised approach to the problem of track classification in dense 3D range data. This problem involves the classification of objects that have been segmented and tracked without the use of a class-specific tracker. We propose a method based on the EM algorithm: iteratively 1) train a classifier, and 2) extract useful training examples from unlabeled data by e...
UvA-DARE (Digital Academic Repository) Boosting for Multiclass Semi-Supervised Learning
Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: http://uba.uva.nl/en/contact, or a letter to: Library of ...
Journal title:
- Pattern Recognition Letters
Volume 37, Issue
Pages -
Publication date 2014